Magnetic Resonance Fingerprinting (MRF) is an efficient quantitative MRI technique that can extract important tissue and system parameters such as T1, T2, B0, and B1 from a single scan. This property also makes it attractive for retrospectively synthesizing contrast-weighted images. In general, contrast-weighted images such as T1-weighted and T2-weighted images can be synthesized directly from parameter maps through spin-dynamics simulation (i.e., Bloch or Extended Phase Graph models). However, these approaches often exhibit artifacts due to imperfections in the mapping, the sequence modeling, and the data acquisition. Here we propose a supervised learning-based method that directly synthesizes contrast-weighted images from the MRF data without going through quantitative mapping and spin-dynamics simulation. To implement our direct contrast synthesis (DCS) method, we deploy a conditional Generative Adversarial Network (GAN) framework and propose a multi-branch U-Net as the generator. The input MRF data are used to directly synthesize T1-weighted, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images through supervised training on paired MRF and target spin echo-based contrast-weighted scans. In-vivo experiments demonstrate excellent image quality compared to simulation-based contrast synthesis and previous DCS methods, both visually and by quantitative metrics. We also demonstrate cases where our trained model is able to mitigate the in-flow and spiral off-resonance artifacts typically seen in MRF reconstructions, and thus more faithfully represents conventional spin echo-based contrast-weighted images.
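As a concrete illustration of the generator described above, here is a minimal PyTorch sketch of a multi-branch U-Net: a shared encoder over the MRF time-series channels feeds three separate decoder branches, one per target contrast. Channel counts, depths, and normalization choices are illustrative assumptions, not the authors' exact configuration.

```python
# A minimal sketch (not the authors' released code) of a multi-branch U-Net
# generator for direct contrast synthesis: shared encoder, one decoder per
# target contrast (T1w, T2w, FLAIR). All sizes are illustrative assumptions.
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.InstanceNorm2d(c_out), nn.ReLU(inplace=True),
    )

class MultiBranchUNet(nn.Module):
    def __init__(self, mrf_channels=48, n_branches=3, base=32):
        super().__init__()
        self.enc1 = conv_block(mrf_channels, base)
        self.enc2 = conv_block(base, base * 2)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(base * 2, base * 4)
        # One decoder branch per target contrast.
        self.branches = nn.ModuleList(nn.ModuleDict({
            "up2": nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2),
            "dec2": conv_block(base * 4, base * 2),
            "up1": nn.ConvTranspose2d(base * 2, base, 2, stride=2),
            "dec1": conv_block(base * 2, base),
            "head": nn.Conv2d(base, 1, 1),
        }) for _ in range(n_branches))

    def forward(self, x):
        s1 = self.enc1(x)                       # skip connection 1
        s2 = self.enc2(self.pool(s1))           # skip connection 2
        b = self.bottleneck(self.pool(s2))
        outs = []
        for br in self.branches:
            d2 = br["dec2"](torch.cat([br["up2"](b), s2], dim=1))
            d1 = br["dec1"](torch.cat([br["up1"](d2), s1], dim=1))
            outs.append(br["head"](d1))
        return outs                             # [T1w, T2w, FLAIR] predictions

# mrf = torch.randn(1, 48, 128, 128)  # compressed MRF time frames as channels
# t1w, t2w, flair = MultiBranchUNet()(mrf)
```

In a conditional GAN setup of this kind, each branch output would be scored by a discriminator against the corresponding spin echo-based target image.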
Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels after being trained on fewer than 50 pre-existing labeled images. Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation. We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes. Our method achieves state-of-the-art performance in semantic segmentation, keypoint detection, and depth estimation compared to prior dataset generation approaches and transfer learning baselines. We additionally showcase its ability to address broad challenges in model development which stem from fixed, hand-annotated datasets, such as the long-tail problem in semantic segmentation.
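To make the dataset-generation recipe concrete, here is a sketch of the DatasetGAN-style core that this line of work builds on: intermediate generator feature maps are upsampled to image resolution, concatenated per pixel, and fed to a small classifier that predicts each pixel's label. All shapes and names are illustrative assumptions rather than the HandsOff implementation.

```python
# Conceptual sketch: train a per-pixel label head on upsampled generator
# features, then generate unlimited (image, label) pairs. Shapes are toy values.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelLabelHead(nn.Module):
    """Predicts a per-pixel label from concatenated generator features."""
    def __init__(self, feat_dim, n_classes):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, n_classes))

    def forward(self, feats):                          # feats: (B, C, H, W)
        b, c, h, w = feats.shape
        x = feats.permute(0, 2, 3, 1).reshape(-1, c)   # one row per pixel
        return self.mlp(x).view(b, h, w, -1).permute(0, 3, 1, 2)

def stack_features(feature_maps, size):
    """Bilinearly upsample each generator feature map and concatenate."""
    ups = [F.interpolate(f, size=size, mode="bilinear", align_corners=False)
           for f in feature_maps]
    return torch.cat(ups, dim=1)

# Toy example: two fake "generator" feature maps at different resolutions.
# feats = stack_features([torch.randn(1, 64, 16, 16), torch.randn(1, 32, 32, 32)], (64, 64))
# logits = PixelLabelHead(feat_dim=96, n_classes=19)(feats)   # (1, 19, 64, 64)
```

HandsOff's contribution, per the abstract, is obtaining the latents for the small labeled training set via GAN inversion rather than requiring annotation of freshly generated images.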
Building systems that achieve a deeper understanding of language is one of the central goals of natural language processing (NLP). Towards this goal, recent works have begun to train language models on narrative datasets which require extracting the most critical information by integrating across long contexts. However, it is still an open question whether these models are learning a deeper understanding of the text, or if the models are simply learning a heuristic to complete the task. This work investigates this further by turning to the one language processing system that truly understands complex language: the human brain. We show that training language models for deeper narrative understanding results in richer representations that have improved alignment to human brain activity. We further find that the improvements in brain alignment are larger for character names than for other discourse features, which indicates that these models are learning important narrative elements. Taken together, these results suggest that this type of training can indeed lead to deeper language understanding. These findings have consequences both for cognitive neuroscience by revealing some of the significant factors behind brain-NLP alignment, and for NLP by highlighting that understanding of long-range context can be improved beyond language modeling.
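The abstract does not spell out how alignment to brain activity is computed; a standard choice in this literature, sketched below, is to ridge-regress fMRI responses onto language-model representations and score held-out Pearson correlation per voxel. The arrays and hyperparameters here are placeholders.

```python
# A minimal sketch of a standard brain-alignment evaluation (an assumption,
# not necessarily this paper's exact procedure): ridge regression from model
# representations to fMRI, scored by held-out per-voxel correlation.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def brain_alignment(model_feats, fmri, alpha=1.0, n_splits=5):
    """model_feats: (n_timepoints, n_dims); fmri: (n_timepoints, n_voxels)."""
    scores = []
    for tr, te in KFold(n_splits=n_splits).split(model_feats):
        reg = Ridge(alpha=alpha).fit(model_feats[tr], fmri[tr])
        pred = reg.predict(model_feats[te])
        # Pearson r per voxel between predicted and observed responses.
        p = (pred - pred.mean(0)) / (pred.std(0) + 1e-8)
        y = (fmri[te] - fmri[te].mean(0)) / (fmri[te].std(0) + 1e-8)
        scores.append((p * y).mean(0))
    return np.mean(scores, axis=0)          # mean correlation per voxel

# feats = np.random.randn(500, 768); bold = np.random.randn(500, 1000)
# print(brain_alignment(feats, bold).mean())
```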
Language models have been shown to be very effective in predicting brain recordings of subjects experiencing complex language stimuli. For a deeper understanding of this alignment, it is important to understand how the human brain's detailed processing of linguistic information corresponds to that of language models. In NLP, linguistic probing tasks have revealed a hierarchy of information processing in neural language models that progresses from simple to complex with increasing depth. In neuroscience, on the other hand, the strongest alignment with high-level language brain regions has consistently been observed in the middle layers. These findings leave open the question of what linguistic information actually underlies the observed alignment between brains and language models. We investigate this question via a direct approach, in which we eliminate information related to specific linguistic properties in the language model representations and observe how this intervention affects the alignment with fMRI brain recordings obtained while participants listened to a story. We investigate a range of linguistic properties (surface, syntactic, and semantic) and find that the elimination of each one results in a significant decrease in brain alignment across all layers of a language model. These findings provide direct evidence for the role of specific linguistic information in the alignment between brain and language models, and open new avenues for mapping the joint information processing in both systems.
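As a sketch of one common linear information-removal scheme consistent with this setup, the snippet below regresses the representations on the target property's features and keeps the residuals, so the property is no longer linearly decodable. Whether the paper uses exactly this variant is an assumption.

```python
# A sketch of linear "information removal": fit a least-squares map from
# linguistic-property features to representations and subtract it, leaving
# residuals with the property's linear component eliminated.
import numpy as np

def remove_property(reps, prop_feats):
    """reps: (n_samples, d_model); prop_feats: (n_samples, d_prop).
    Returns representations with the best linear fit to prop_feats removed."""
    X = np.hstack([prop_feats, np.ones((len(prop_feats), 1))])   # add bias column
    W, *_ = np.linalg.lstsq(X, reps, rcond=None)                 # least-squares fit
    return reps - X @ W                                          # residual representations

# reps = np.random.randn(200, 768); pos_feats = np.random.randn(200, 17)
# cleaned = remove_property(reps, pos_feats)   # then re-run brain alignment
```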
Pretrained language models that have been trained to predict the next word over billions of text documents have been shown to also significantly predict brain recordings of people comprehending language. Understanding the reasons behind the observed similarities between language in machines and language in the brain can lead to more insight into both systems. Recent works suggest that the prediction of the next word is a key mechanism that contributes to the alignment between the two. What is not yet understood is whether prediction of the next word is necessary for this observed alignment or simply sufficient, and whether there are other shared mechanisms or information that are similarly important. In this work, we take a first step towards a better understanding via two simple perturbations in a popular pretrained language model. The first perturbation is to improve the model's ability to predict the next word in the specific naturalistic stimulus text that the brain recordings correspond to. We show that this indeed improves the alignment with the brain recordings. However, this improved alignment may also be due to improved word-level or multi-word-level semantics of the specific world described by the stimulus narrative. We aim to disentangle the contribution of next-word prediction and semantic knowledge via our second perturbation: scrambling the word order at inference time, which reduces the ability to predict the next word but maintains any newly learned word-level semantics. By comparing the alignment with brain recordings of these differently perturbed models, we show that improvements in alignment with brain recordings are due to more than improvements in next-word prediction and word-level semantics.
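The second perturbation is simple enough to sketch directly: scramble the word order of the stimulus before extracting representations, so next-word prediction degrades while order-free, word-level semantics survive. The model, tokenizer, and pooling below are illustrative choices, not the paper's exact setup.

```python
# A sketch of the word-order scrambling perturbation at inference time.
# Model/tokenizer names and mean pooling are illustrative assumptions.
import random
from transformers import AutoModel, AutoTokenizer

def scrambled_representation(text, model, tokenizer, seed=0):
    words = text.split()
    random.Random(seed).shuffle(words)          # permute word order in place
    enc = tokenizer(" ".join(words), return_tensors="pt")
    hidden = model(**enc).last_hidden_state     # (1, seq_len, d_model)
    return hidden.mean(dim=1)                   # pooled stimulus embedding

# tok = AutoTokenizer.from_pretrained("gpt2")
# mdl = AutoModel.from_pretrained("gpt2")
# emb = scrambled_representation("the boy chased the dog", mdl, tok)
```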
Chromosome analysis is essential for diagnosing genetic disorders. For hematologic malignancies, identification of somatic clonal aberrations by karyotype analysis remains the standard of care. However, karyotyping is costly and time-consuming because of the largely manual process and the expertise required to identify and annotate aberrations. Efforts to automate karyotype analysis have to date fallen short in aberration detection. Using a training set of ~10k patient specimens and ~50k karyograms spanning more than five years from the Fred Hutchinson Cancer Center, we created a labeled set of images representing individual chromosomes. These individual chromosomes were used to train and assess deep learning models for classifying the 24 human chromosomes and identifying chromosomal aberrations. The top-accuracy models utilized the recently introduced Topological Vision Transformers (TopViTs) with 2-level block-Toeplitz masking to incorporate structural inductive bias. TopViT outperformed CNN (Inception) models with >99.3% accuracy for chromosome identification, and exhibited accuracies >99% for detecting most aberration types. Notably, we were able to show high-quality performance even in "few shot" learning scenarios. Incorporating the definition of clonality substantially improved both precision and recall (sensitivity). When applied to "zero shot" scenarios, the model captured aberrations without training, with perfect precision at >50% recall. Together these results show that modern deep learning models can approach expert-level performance for chromosome aberration detection. To our knowledge, this is the first study demonstrating the downstream effectiveness of TopViTs. These results open up exciting opportunities for not only expediting patient results but also providing a scalable technology for early screening of low-abundance chromosomal lesions.
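A 2-level block-Toeplitz attention mask means the learned bias between two patches on an H×W grid depends only on their (row, column) offset. The sketch below constructs such a bias table; it illustrates the masking structure only and is not the TopViT implementation.

```python
# A sketch of a 2-level block-Toeplitz attention bias: one learnable scalar
# per 2D patch offset, gathered into an (N, N) matrix to add to attention
# logits. This illustrates the structure, not the TopViT codebase.
import torch
import torch.nn as nn

class BlockToeplitzBias(nn.Module):
    def __init__(self, grid_h, grid_w):
        super().__init__()
        self.grid = (grid_h, grid_w)
        # One learnable scalar per 2D offset: (2H-1) x (2W-1) table.
        self.table = nn.Parameter(torch.zeros(2 * grid_h - 1, 2 * grid_w - 1))

    def forward(self):
        h, w = self.grid
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        pos = torch.stack([ys.flatten(), xs.flatten()], dim=1)      # (N, 2)
        rel = pos[:, None, :] - pos[None, :, :]                     # (N, N, 2) offsets
        return self.table[rel[..., 0] + h - 1, rel[..., 1] + w - 1] # (N, N) bias

# bias = BlockToeplitzBias(14, 14)()   # add to attention logits: scores + bias
```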
Email phishing is becoming increasingly prevalent, and phishing attacks have grown more sophisticated over time. To combat this rise, many machine learning (ML) algorithms have been developed to detect phishing emails. However, because of the limited email datasets these algorithms are trained on, they are not adept at recognizing varied attacks and thus suffer from concept drift: attackers can introduce small changes to the statistical characteristics of their emails or websites to successfully bypass detection. Over time, a gap develops between the accuracy reported in the literature and an algorithm's actual effectiveness in the real world, manifesting as frequent false-positive and false-negative classifications. To this end, we propose a multidimensional risk assessment of emails that reduces the feasibility of an attacker tuning an email to evade detection. This horizontal approach to phishing detection profiles an incoming email on its main features. We develop a risk assessment framework that includes three models analyzing an email's (1) threat level, (2) cognitive manipulation, and (3) email type, which we combine to return a final risk assessment score. The Profiler does not require large datasets for training to be effective, and its analysis of email features reduces the impact of concept drift. Our Profiler can be used in conjunction with ML approaches, either to reduce their misclassifications or as a labeler of large email datasets during the training stage. We evaluated the efficacy of the Profiler against an ensemble of state-of-the-art ML algorithms on a dataset of 9,000 legitimate and 900 phishing emails from a large Australian research organization. Our results indicate that the Profiler mitigates the impact of concept drift, producing 30% fewer false-positive and 25% fewer false-negative email classifications than the ML ensemble approach.
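The sketch below illustrates the Profiler's overall shape as described above: three lightweight, feature-based analyses of an incoming email (threat level, cognitive manipulation, email type) combined into a single risk score. The features, weights, and thresholds are invented placeholders, not the paper's actual models.

```python
# A toy sketch of the three-model risk-assessment structure. Every feature,
# weight, and rule here is a hypothetical placeholder for illustration.
from dataclasses import dataclass

@dataclass
class Email:
    sender_domain: str
    urls: list
    urgency_words: int
    is_reply: bool

def threat_level(e: Email) -> float:
    # Placeholder rule: many URLs and suspicious TLDs raise the threat score.
    return min(1.0, 0.3 * len(e.urls) + (0.4 if e.sender_domain.endswith(".xyz") else 0.0))

def cognitive_manipulation(e: Email) -> float:
    # Placeholder rule: urgency cues suggest psychological manipulation.
    return min(1.0, 0.2 * e.urgency_words)

def email_type(e: Email) -> float:
    # Placeholder rule: unsolicited (non-reply) emails are treated as riskier.
    return 0.2 if e.is_reply else 0.6

def risk_score(e: Email) -> float:
    # Weighted combination of the three profiles into one final assessment.
    return 0.5 * threat_level(e) + 0.3 * cognitive_manipulation(e) + 0.2 * email_type(e)

# print(risk_score(Email("promo.xyz", ["http://x.xyz/pay"], 3, False)))  # high risk
```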
The popularity of online fashion shopping continues to grow, and the ability to offer effective recommendations to customers is becoming increasingly important. In this work, we focus on the Fashion Outfits Challenge, part of the SIGIR 2022 Workshop on eCommerce. The challenge centers on the fill-in-the-blank (FITB) task: predicting the missing item of an incomplete outfit given a list of candidates. In this paper, we focus on applying Siamese networks to the task. More specifically, we explore how combining information from multiple modalities (textual and visual) affects the model's performance. We evaluate our model on the test split provided by the challenge organizers and on a gold split we created during the development phase. We find that using the visual modality alone, as well as the visual and textual modalities together, shows promising results on the task. We conclude by suggesting further improvements to our method.
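Here is a minimal sketch of a Siamese setup for FITB: one shared encoder embeds both the items of the incomplete outfit (mean-pooled into an outfit embedding) and each candidate, and the highest-similarity candidate is predicted. Feature dimensions and concatenation-based fusion of textual and visual inputs are illustrative assumptions.

```python
# A sketch of Siamese FITB scoring with a shared multimodal encoder.
# Dimensions and the fusion scheme are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEncoder(nn.Module):
    def __init__(self, text_dim=300, vis_dim=512, out_dim=128):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(text_dim + vis_dim, 256), nn.ReLU(), nn.Linear(256, out_dim))

    def forward(self, text_feat, vis_feat):
        # Fuse modalities by concatenation, then L2-normalize the embedding.
        return F.normalize(self.proj(torch.cat([text_feat, vis_feat], dim=-1)), dim=-1)

def fitb_predict(enc, outfit_items, candidates):
    """outfit_items, candidates: lists of (text_feat, vis_feat) tensors."""
    outfit = torch.stack([enc(t, v) for t, v in outfit_items]).mean(0)  # outfit embedding
    scores = torch.stack([enc(t, v) @ outfit for t, v in candidates])   # cosine scores
    return scores.argmax().item()                                       # best candidate

# enc = SiameseEncoder()
# items = [(torch.randn(300), torch.randn(512)) for _ in range(3)]
# cands = [(torch.randn(300), torch.randn(512)) for _ in range(4)]
# print(fitb_predict(enc, items, cands))
```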
E-commerce provides rich multimodal data that is barely leveraged in practice. One aspect of this data is the category tree used for search and recommendation. In practice, however, during a user's session there is often a mismatch between the textual and visual representations of a given category. Motivated by this problem, we introduce the task of category-to-image retrieval in e-commerce and propose a model for the task, CLIP-ITA. The model leverages information from multiple modalities (textual, visual, and attribute) to create product representations. We explore how adding information from multiple modalities affects the model's performance. In particular, we observe that CLIP-ITA significantly outperforms a comparable model that leverages only the visual modality and a comparable model that leverages the visual and attribute modalities.
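The sketch below captures the multimodal-fusion idea behind CLIP-ITA as described here: project textual, visual, and attribute embeddings into a shared space, fuse them into one product representation, and rank products by similarity to a category embedding. The encoders, fusion-by-summation, and dimensions are placeholder choices, not the paper's architecture.

```python
# A sketch of a multimodal product encoder for category-to-image retrieval.
# Projection dimensions and summation-based fusion are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ProductEncoder(nn.Module):
    def __init__(self, text_dim=512, vis_dim=512, attr_dim=128, out_dim=256):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, out_dim)
        self.vis_proj = nn.Linear(vis_dim, out_dim)
        self.attr_proj = nn.Linear(attr_dim, out_dim)

    def forward(self, text_emb, vis_emb, attr_emb):
        # Fuse the three modality projections into one product representation.
        fused = self.text_proj(text_emb) + self.vis_proj(vis_emb) + self.attr_proj(attr_emb)
        return F.normalize(fused, dim=-1)

def retrieve(category_emb, product_embs, k=5):
    """Rank products by cosine similarity to the category embedding."""
    sims = product_embs @ F.normalize(category_emb, dim=-1)
    return sims.topk(k).indices

# enc = ProductEncoder()
# products = enc(torch.randn(100, 512), torch.randn(100, 512), torch.randn(100, 128))
# top5 = retrieve(torch.randn(256), products)
```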
We address the problem of self-supervised learning on discrete event sequences generated by real-world users. Self-supervised learning distills the complex information in raw data into low-dimensional, fixed-length vector representations that can easily be applied to various downstream machine learning tasks. In this paper, we propose a new method, CoLES, which adapts contrastive learning, previously used in the audio and computer vision domains, to the domain of discrete event sequences in a self-supervised setting. We deployed CoLES embeddings based on transaction sequences from a large European financial services company. Using CoLES embeddings significantly improves the performance of pre-existing models on downstream tasks and yields substantial financial gains, measured in hundreds of millions of dollars yearly. We also evaluated CoLES on several public event sequence datasets and show that CoLES representations consistently outperform other methods on different downstream tasks.
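A minimal sketch of the CoLES idea under its usual formulation: random sub-sequences of the same user's event sequence form positive pairs, a sequence encoder embeds them, and an InfoNCE-style loss pulls positives together while pushing other users' sub-sequences apart. The encoder, slicing, and loss details below are simplified assumptions.

```python
# A sketch of contrastive learning over event-sequence sub-samples.
# Encoder architecture, slice lengths, and temperature are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeqEncoder(nn.Module):
    def __init__(self, n_event_types=100, emb=32, hidden=64):
        super().__init__()
        self.emb = nn.Embedding(n_event_types, emb)
        self.rnn = nn.GRU(emb, hidden, batch_first=True)

    def forward(self, events):                 # events: (B, T) integer event codes
        _, h = self.rnn(self.emb(events))
        return F.normalize(h[-1], dim=-1)      # (B, hidden) sequence embeddings

def random_slice(seq, length=8):
    start = torch.randint(0, seq.numel() - length + 1, (1,)).item()
    return seq[start:start + length]

def coles_loss(enc, sequences, tau=0.1):
    """sequences: list of 1-D LongTensors, one per user."""
    a = torch.stack([random_slice(s) for s in sequences])   # view 1 per user
    b = torch.stack([random_slice(s) for s in sequences])   # view 2 per user
    za, zb = enc(a), enc(b)
    logits = za @ zb.T / tau                                # (B, B) similarities
    target = torch.arange(len(sequences))                   # positives on the diagonal
    return F.cross_entropy(logits, target)

# enc = SeqEncoder()
# seqs = [torch.randint(0, 100, (50,)) for _ in range(16)]
# loss = coles_loss(enc, seqs); loss.backward()
```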